Genomic Sequencing of the Severe Acute Respiratory
Syndrome-Coronavirus

Summary


The polymerase chain reaction (PCR), which can exponentially replicate a target 
DNA sequence, has formed the basis for the sensitive and direct examination of clinical 
samples for evidence of infection. During the epidemic of severe acute respiratory syn- 
drome (SARS) in 2003, PCR not only offered a rapid way to diagnose SARS-coronavirus 
(SARS-CoV) infection, but also made the molecular analysis of its genomic sequence 
possible. Sequence variations were observed in the SARS-CoV obtained from different
patients in this epidemic. These unique viral genetic signatures can be applied as a pow- 
erful molecular tool in tracing the route of transmission and in studying the genome 
evolution of SARS-CoV. To extract this wealth of information from the limited primary 
clinical specimens of SARS patients, we were presented with the challenge of efficiently 
amplifying fragments of the SARS-CoV genome for analysis. In this chapter, we will 
discuss how we managed to accomplish this task with our optimized protocols on reverse- 
transcription, nested PCR amplification, and DNA cycle sequencing. We will also dis- 
cuss the sequence variations that typified some strains of SARS-CoV in the different 
phases during this epidemic. PCR amplification of the viral sequence and genomic 
sequencing of these critical sequence variations of re-emerging SARS-CoV strains would 
give us quick insights into the virus. 


Key Words: SARS coronavirus; viral RNA extraction; reverse-transcription PCR; 
sequencing; genomic sequence variation. 


1. Introduction 


Severe acute respiratory syndrome-coronavirus (SARS-CoV), the etiologic
agent of SARS (1-3), is a virus that was unknown to us before the SARS
epidemic. The concerted efforts of researchers promptly elucidated its
genetic code. The genome of SARS-CoV is a 29,727-nucleotide, polyadenylated
RNA. The genomic organization is typical of coronaviruses, having the
characteristic gene order (5'-polymerase [Orf1ab], spike [S], envelope [E],
membrane [M], and nucleocapsid [N]-3') and short untranslated regions at both
termini (4,5).

With this sequence information, rapid PCR-based molecular diagnostic tests 
of SARS-CoV infection were designed (1,6-10). Besides offering molecular
diagnosis and quantitative measurement of viral load, PCR-based technologies 
have also been exploited to amplify the genomic fragments of SARS-CoV for 
sequence analysis. The high sensitivity and specificity of PCR have made this
genomic sequence analysis possible even for uncultured clinical specimens. 
Unlike the conventional microbiological methods, PCR-based technologies 
may not require viral culture, which could introduce culture-derived artifacts 
in the genomic sequence. The specific PCR primers selectively amplify SARS- 
CoV sequences from the background of other nucleic acid sequences contrib- 
uted by the patient or other microbes. Moreover, the PCR-based method is 
versatile in terms of the type of clinical specimens. In our hands, we have suc- 
cessfully analyzed the SARS-CoV genome directly from uncultured samples 
of serum, nasopharyngeal aspirate, and stools (17). This obviates any concern 
about the poor or even unsuccessful viral culture of the precious clinical speci- 
mens. The risk in handling large-volume and hazardous viral culture could 
also be avoided. 

Genomic sequence variations were observed in the SARS-CoV obtained 
from different patients in this epidemic. Based on these sequence variations, 
most of the isolates are typified by two groups: isolates obtained from patients 
who were epidemiologically linked to the Metropole Hotel in Hong Kong, and 
those who were not (3,12,13). For example, there are seven sequence varia- 
tions that can distinguish isolate CUHK-Su10, which is linked to the Metropole 
Hotel, from isolate CUHK-W1, which is not linked to this hotel case cluster 
(Table 1). Among them, four variations at nucleotide positions 17564, 21721, 
22222, and 27827 (according to the Tor2 sequence in GenBank, accession no. 
AY274119 [5]) were suggested by the Chinese SARS Molecular Epidemiology
Consortium (14) as part of a haplotype configuration that marks the different
phases of a tri-phasic SARS epidemic in Guangdong Province of China.
CUHK-W1 carried a haplotype G:A:C:C that typified the middle phase. Notably,
the same haplotype was observed in CUHK-L2, which was one of the earliest
confirmed cases of SARS in Hong Kong, having been documented even before any
report of the hotel case cluster (15). CUHK-Su10 carried a haplotype T:G:T:T
that typified the late phase, marked by the hotel case cluster that spread the
virus to many other parts of the world.
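
The haplotype notation used above can be made concrete with a short Python sketch. It simply joins the bases found at the four marker positions (Tor2 numbering) and compares the result with the middle-phase motif quoted in the text. The function and variable names are illustrative assumptions, and the bases for CUHK-W1 are those listed in Table 1.

```python
# Marker positions, numbered according to the Tor2 sequence (GenBank AY274119).
HAPLOTYPE_POSITIONS = (17564, 21721, 22222, 27827)

# Motif quoted in the text as typical of the middle phase of the epidemic.
MIDDLE_PHASE = "G:A:C:C"


def haplotype(bases_by_position):
    """Join the bases at the four marker positions, e.g. 'G:A:C:C'."""
    return ":".join(bases_by_position[pos] for pos in HAPLOTYPE_POSITIONS)


# Bases of isolate CUHK-W1 at the marker positions, as listed in Table 1.
cuhk_w1 = {17564: "G", 21721: "A", 22222: "C", 27827: "C"}

w1_haplotype = haplotype(cuhk_w1)
print(w1_haplotype, "middle phase" if w1_haplotype == MIDDLE_PHASE else "other")
```

In practice the dictionary would be filled from the base calls of the sequenced amplicons that cover these four positions.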

Genomic sequence variations in SARS-CoV have also revealed the routes of
infection within communities and across cities. For instance, compared
with isolate CUHK-Su10, two mutations, T3852C and C11493T, first appeared
in isolates CUHK-AG01, CUHK-AG02, and CUHK-AG03 (GenBank accession
numbers AY345986, AY345987, and AY345988) obtained from patients involved
in the Amoy Gardens outbreak in Hong Kong (11). Later, these two genetic
fingerprints appeared in 10 completely sequenced Taiwanese isolates (16).

Table 1
Comparison of the Sequences of Two Strains of Severe Acute
Respiratory Syndrome (SARS)-Coronavirus Isolated From
Patients in Hong Kong at the Beginning of the Epidemic (a)

Nucleotide position    CUHK-Su10    CUHK-W1
9404                   T            C
9479                   T            C
17564 (b)              T            G
19064                  A            G
21721 (b)              G            A
22222 (b)              T            C
27827 (b)              T            C

(a) Sequence variations at seven positions between the two viral strains
(CUHK-Su10 and CUHK-W1) are indicated. The nucleotide positions are
numbered according to the sequence of GenBank accession number
AY274119.

(b) Part of the haplotype suggested by the Chinese SARS Molecular
Epidemiology Consortium for distinguishing the early, middle, and late
phases of the SARS epidemic in 2003.


Interestingly, toward the end of the epidemic, another type of fingerprint
was found by a PCR-based method. A variant of SARS-CoV with a 386-nucleotide
deletion was reported in a cluster of patients who appeared to be
epidemiologically related (17). Most of the cases were part of a documented
outbreak in the North District Hospital in Hong Kong.

We have illustrated that sequence variations among different isolates have a 
remarkable epidemiological correlation. Thus, PCR amplification followed by 
sequencing is a powerful tool in tracing the route of transmission. The sequence 
information may provide objective support to epidemiological investigations. 
Moreover, in the event that SARS re-emerges, one could quickly gain impor- 
tant insight into the origin and evolution status of the SARS-CoV simply by 
sequencing the critical sequence variations of the genome, as exemplified here. 
However, to extract this wealth of information from the limited primary
clinical SARS specimens, we need sensitive protocols that efficiently
amplify fragments of the SARS-CoV genome for analysis.


2. Materials
2.1. RNA Extraction 


1. QIAamp viral RNA Mini Kit (Qiagen, Hilden, Germany). 
2. Absolute ethanol. 


2.2. Reverse-Transcription 


1. SuperScript III RNase H- reverse transcriptase (Invitrogen, Carlsbad, CA).

2. Random hexamers (Applied Biosystems, Foster City, CA).

3. 5X First-strand synthesis buffer: 250 mM Tris-HCl, pH 8.3, 375 mM KCl, 15 mM
MgCl2 (Invitrogen).

4. RNasin RNase inhibitor (Promega, Madison, WI).

5. dNTP (Invitrogen).

6. 0.1 M Dithiothreitol (DTT).

7. RNase-free water (Promega).


2.3. PCR Amplification


1. Advantage cDNA Polymerase mix and buffer (BD Biosciences Clontech, 
Palo Alto, CA). 

2. dNTP. 

3. PCR primers. 

4. Distilled water. 


2.4. Genomic Sequencing 


1. BigDye Terminator v1.1 Cycle Sequencing Kit (Applied Biosystems). 

2. ABI Prism 3100 Genetic Analyzer (Applied Biosystems). 

3. DYEnamic ET Dye Terminator Kit (GE Healthcare-Biosciences, Little Chalfont, UK).
4. MegaBACE 1000 Sequencing System (GE Healthcare-Biosciences). 


2.5. Sequence Analysis and Comparison 


SeqScape software (Applied Biosystems). 


3. Methods 
3.1. Precautions Against Potential Contamination 


Genomic sequencing involves PCR amplification, which produces numer- 
ous copies of the target DNA, and cycle sequencing, which requires the 
pipetting and manipulation of PCR products. These steps could easily con- 
taminate the laboratory environment with amplified products. Such contami- 
nation problems would affect the interpretation of sequencing results, and 
adversely affect the performance of diagnostic tests designed to detect the same 
viral sequences. Hence, extreme care should be taken to avoid contamination. 
We suggest the following precautions: 

1. Perform RNA extraction, PCR amplification, and genome sequencing in different
laboratories, or at least in separate and dedicated compartments of the same
laboratory.

2. Transfer reagents and samples only with aerosol-resistant pipet tips.

3. Prepare the PCR reagent master mix in a hood dedicated for this purpose. A set of
clean gloves and a dedicated lab gown should be worn in this area. Illuminate the
hood with ultraviolet light before and after use.

4. Any steps that involve the handling of cDNA, primary and secondary PCR
products (including addition of DNA templates in assembling the PCR),
electrophoresis, and cycle sequencing should be performed in a dedicated area far
away from any PCR reagents. A separate lab gown and set of gloves should be worn in
this area.

5. Discard all pipet tips that contacted DNA with extreme care. Use a double bag
for disposal.

6. Include multiple negative PCR controls in each amplification to monitor for
environmental contamination.


3.2. RNA Extraction


1. Prepare AVL lysis buffer and AW1 and AW2 wash buffers according to the
manufacturer's (Qiagen) instructions (see Note 1).

2. In a biosafety level 2 (or above) containment laboratory, lyse 0.28 mL (1 vol) of
viral culture by adding 1.12 mL (4 vol) of AVL buffer, mixing, and incubating at
room temperature for 10 min. Direct clinical samples, e.g., serum, nasopharyngeal
aspirates, and stools, can also be used (see Note 2).

3. Add 1.12 mL of absolute ethanol to the mixture. Pulse-vortex for 15 s.

4. Load the mixture onto a QIAamp spin column and wash the column according to the
manufacturer's instructions.

5. Add 60 µL of RNase-free water onto the membrane and incubate for 1 min at
room temperature. Centrifuge the spin column for 1 min at 6000g.

6. Quantify the yield of extracted viral RNA in a small aliquot by real-time
quantitative reverse-transcription (RT)-PCR (9) (see Note 3).

7. Store the extracted RNA at -80°C.
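
The volume ratios in the lysis steps can be written out as a small Python helper for cases where the sample volume differs from 0.28 mL. This is only a sketch: the 1:4 sample-to-AVL ratio is stated in the protocol, whereas treating the ethanol volume as equal to the AVL volume is an assumption inferred from the 1.12 mL figure. The function name is illustrative, and the kit handbook remains the authority on permissible volumes.

```python
def lysis_volumes(sample_ml):
    """Return AVL buffer and ethanol volumes (mL) for a given sample volume (mL)."""
    avl_ml = round(4 * sample_ml, 2)      # 4 vol of AVL buffer per 1 vol of sample
    ethanol_ml = round(4 * sample_ml, 2)  # assumption: same volume as the AVL buffer
    total_ml = round(sample_ml + avl_ml + ethanol_ml, 2)
    return {"avl_ml": avl_ml, "ethanol_ml": ethanol_ml, "total_ml": total_ml}


# The volumes used in the protocol above.
print(lysis_volumes(0.28))
```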


3.3. Reverse-Transcription


1. Prewarm two thermocycler blocks with heated lids at 72 and 25°C, respectively.

2. Mix 1 µL (50 pmol) of random hexamers with 10 µL of RNA in a 0.5-mL tube.
Denature at 72°C for 10 min (see Note 4).

3. During this period, assemble the reaction mix in another tube on ice according to
Table 2 using SuperScript III RNase H- Reverse Transcriptase (see Note 5).

4. After denaturation, snap-cool the RNA-primer mixture on ice for 1 min. Briefly
spin the tubes. Add the reaction mix prepared in step 3 to the RNA-primer mixture
to make up a total reaction volume of 20 µL. Mix by pipetting gently up and
down.

Table 2
Composition of Reaction Mix for Reverse-Transcription
of Severe Acute Respiratory Syndrome-Coronavirus RNA

Component                                   Volume for one     Final
                                            reaction (µL)      concentration
5X first strand buffer                      4                  1X
dNTP mix (10 mM each dATP, dGTP,            1                  0.5 mM each
  dCTP, and dTTP at neutral pH)
0.1 M dithiothreitol                        1                  5 mM
RNasin RNase inhibitor (40 U/µL)            1                  2 U/µL
SuperScript III reverse                     2                  20 U/µL
  transcriptase (200 U/µL)
Total volume                                9


Table 3
Composition of Reaction Mix for Polymerase Chain Reaction Amplification

Component                                   Volume for one     Final
                                            reaction (µL)      concentration
Distilled water                             19.5
Advantage PCR buffer (10X)                  2.5                1X
dNTP mix (10 mM each dATP, dGTP,            0.5                200 µM each
  dCTP, and dTTP at neutral pH)
cDNA polymerase mix (50X)                   0.5                1X
Total volume                                23.0


5. Immediately transfer the tube from ice to the prewarmed 25°C thermocycler block
for a 5-min incubation. Prewarm the other thermocycler block at 55°C.

6. Transfer the tube to the prewarmed 55°C thermocycler block for a 1-h incubation.

7. Heat inactivate at 72°C for 15 min.

8. Add 1 µL (2 U) of RNase H and incubate at 37°C for 20 min to remove RNA
complementary to the cDNA.

9. Dilute the product two- to fivefold with distilled water. Store at -20°C before use.
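
The final concentrations quoted for the reverse-transcription mix follow from the stock concentration, the volume added, and the 20 µL total reaction volume. The Python sketch below reproduces that arithmetic. It assumes 1 µL each of the dNTP mix, dithiothreitol, and RNasin per reaction, which is the reading consistent with the 9 µL total of Table 2, and the names used are illustrative.

```python
TOTAL_REACTION_UL = 20.0  # 9 uL reaction mix + 10 uL RNA + 1 uL random hexamers

# component: (stock concentration, volume per reaction in uL)
# Volumes of 1 uL for dNTP, DTT and RNasin are assumed from the 9 uL total.
COMPONENTS = {
    "first strand buffer (X)": (5.0, 4.0),
    "dNTP (mM each)": (10.0, 1.0),
    "dithiothreitol (mM)": (100.0, 1.0),  # 0.1 M stock
    "RNasin (U/uL)": (40.0, 1.0),
    "reverse transcriptase (U/uL)": (200.0, 2.0),
}


def final_concentration(stock, volume_ul, total_ul=TOTAL_REACTION_UL):
    """Dilution formula: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


final = {name: final_concentration(*spec) for name, spec in COMPONENTS.items()}
for name, value in final.items():
    print(f"{name}: {value}")
```

The same function can be reused to check a modified recipe before it is pipetted.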


3.4. Primary PCR Amplification


1. Inside a hood dedicated for setting up PCR, assemble the PCR master mix for the
50 reactions according to Table 3 with cDNA polymerase mix (see Note 6) in a
final reaction volume of 25 µL. Add 50 aliquots of 23 µL into a 96-well PCR
microplate.

2. Add 5 pmol each of forward (PCR-F) and reverse (PCR-R) series of primers for
each of the 50 reactions amplifying the overlapping amplicons that cover the whole
SARS-CoV genome (see Note 7). The primer sequences are shown in Table 4.

3. In an area separate from the hood dedicated for PCR, add 1 µL of diluted
reverse-transcribed products.

4. Commence PCR in a thermocycler with initial denaturation at 95°C for
1 min and 35 cycles of 95°C for 0.5 min, 55°C for 0.5 min, 68°C for 1.5 min, and
a final extension at 68°C for 10 min.
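
Scaling the per-reaction recipe to a 50-reaction master mix is simple multiplication, sketched here in Python. Two points are assumptions rather than part of the protocol: the 10X buffer volume is taken as 2.5 µL per reaction, the value consistent with the 23.0 µL total and the 1X final concentration in 25 µL, and the optional `overage` parameter for pipetting losses is a hypothetical convenience.

```python
# Per-reaction volumes in microliters. The 2.5 uL buffer volume is an assumption
# chosen so that the components sum to the 23.0 uL total stated in Table 3.
PER_REACTION_UL = {
    "distilled water": 19.5,
    "Advantage PCR buffer (10X)": 2.5,
    "dNTP mix (10 mM each)": 0.5,
    "cDNA polymerase mix (50X)": 0.5,
}


def master_mix(n_reactions, overage=0.0):
    """Scale each component to n_reactions, with optional fractional overage."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 1) for name, vol in PER_REACTION_UL.items()}


mix = master_mix(50)
for name, vol in mix.items():
    print(f"{name}: {vol} uL")
print("total:", sum(mix.values()), "uL")  # dispensed as 50 aliquots of 23 uL
```

Calling `master_mix(50, overage=0.1)` would prepare 10% extra, a common habit that the protocol itself does not specify.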


3.5. Secondary PCR Amplification 


1. Inside a hood dedicated for setting up PCR, assemble the PCR master mix for the
50 reactions according to Table 3 in a final reaction volume of 25 µL. Add 50
aliquots of 23 µL into a new 96-well PCR microplate.

2. Add 5 pmol each of forward (PCR-F) and reverse (BSEQ-R) series of primers for
each of the 50 semi-nested PCR reactions. The primer sequences are shown in
Table 4.

3. In an area separate from the hood dedicated for PCR, add 1 µL of the
corresponding primary PCR product.

4. Commence PCR in a thermocycler with initial denaturation at 95°C for 1 min
and 35 cycles of 95°C for 0.5 min, 55°C for 0.5 min, 68°C for 1.5 min, and a final
extension at 68°C for 10 min.

5. Electrophorese 5 µL of the secondary PCR product in a 2% agarose gel to verify
the success of the PCR amplification. Estimate the amount of PCR product by
comparison to a DNA marker. Only products with a single band should be used for
sequencing.


3.6. Cycle Sequencing 


Perform the sequencing reaction based on the dideoxy dye terminator method,
according to the manufacturers' instructions:


1. Separate from the hood dedicated for PCR, assemble the cycle sequencing
reaction with ASEQ-F, BSEQ-F, ASEQ-R, and BSEQ-R series of oligonucleotides
as sequencing primers for each of the amplicons, and with 2-5 ng of secondary
PCR product as sequencing template (see Note 8).

2. Commence the cycle sequencing reaction in a thermocycler.

3. Purify the extension products with either spin column purification or ethanol
precipitation. Mix or resuspend the DNA in formamide solution according to the
manufacturer's instructions.

4. Denature the purified extension products at 95°C for 5 min, snap-cool on ice, and
load onto the automated capillary DNA sequencer for injection.

Table 4
Primer Sequences

Primer      Sequence (5' to 3')
01PCR-F     CTACCCAGGAAAAGCCAACCAACCT
01ASEQ-F    AAAGCCAACCAACCTCGATC
01ASEQ-R    AAGTGCCATTTTTGAGGTGT
01BSEQ-F    TTGCCTGTCCTTCAGGTTAG
01BSEQ-R    GTCACCTAAGTCATAAGACT
01PCR-R     TGCCAAGCTCGTCACCTAAGTCATA
02PCR-F     TACCGCAATGTTCTTCTTCGTAAGA
02ASEQ-F    TTCTTCTTCGTAAGAACGGT
02ASEQ-R    GCTCGTAGCTCTTATCAGAG
02BSEQ-F    CAACTTGATTACATCGAGTC
02BSEQ-R    TTCAGTGCCACAATGTTCAC
02PCR-R     TAACTAAATTTTCAGTGCCACAATG
03PCR-F     TCTACCTTGATGGGGTGTAATCATT
03ASEQ-F    TGAAATCTAATCATTGCGAT
03ASEQ-R    GGAGATCCTCATTCAAGGTC
03BSEQ-F    AATAAGCGTGCCTACTGGGT
03BSEQ-R    AATTGATCTGATAACACCAG
03PCR-R     TGCGCGCAAAAATTGATCTGATAAC
04PCR-F     AAAGGTGCTTGGAACATTGGACAAC
04ASEQ-F    GGAACATTGTACAACAGAGA
04ASEQ-R    ATTTGAGAATCTCCCAAGCA
04BSEQ-F    GGCACTACTGTTGAAAAACT
04BSEQ-R    ATGTGAATCACCTTCAAGAA
04PCR-R     GTACTGTGTCATGTGAATCACCTTC
05PCR-F     CGTCAGTGTATACGTGGCAAGGAGC
05ASEQ-F    TACGTGGCAAGGAGCAGCTG
05ASEQ-R    CAACACGTTCATCAAGCTCA
05BSEQ-F    GGTGCACCAATTAAAGGTGT
05BSEQ-R    ACAGGTTTCATCAATTTCTT
05PCR-R     ACTCATGTTCACAGGTTTCATCAAT
06PCR-F     TCATCACGTATGTATTGTTCCTTTT
06ASEQ-F    TGTATTGTTCCTTTTACCCT
06ASEQ-R    CTGCTACACCACCACCATGT
06BSEQ-F    ATTAAATGTGTTGACATCGT
06BSEQ-R    AACCGTCTGCACGCACACTT
06PCR-R     CCTGTGTACGAACCGTCTGCACGCA
07PCR-F     TCACAGGACATCTTACTTGCACCAT
07ASEQ-F    TCTTACTTGCACCATTGTTG
07ASEQ-R    CTTCACCTCTAAGCATGTTC
07BSEQ-F    ACACTGGAAGAAACTAAGTT
07BSEQ-R    TACAGTTCCTAGAATCTCTT
07PCR-R     AATTCCAGGATACAGTTCCTAGAAT
08PCR-F     GCTAAGACTGCTCTTAAGAAATGCA
08ASEQ-F    CTCTTAAGAAATGCAAATCT
08ASEQ-R    CTACGGCAGGAGCTTTAAGA
08BSEQ-F    AATGAGCCGCTTGTCACAAT
08BSEQ-R    CACTTTTATAGTCTTAACCT
08PCR-R     CAGTTGTGAACACTTTTATAGTCTT
09PCR-F     CCCGTCGAGTTTCATCTTGACGGTG
09ASEQ-F    TTCATCTTGACGGTGAGGTT
09ASEQ-R    AATTGTTATCAGCCCATTTA
09BSEQ-F    AGTTTTCTTGGTAGGTACAT
09BSEQ-R    ATACATCACAGCTTCTACAC
09PCR-R     GAGTACCCATATACATCACAGCTTC
10PCR-F     TTGGAATCTGCAAAGCGAGTTCTTA
10ASEQ-F    CAAAGCGAGTTCTTAATGTG
10ASEQ-R    TGTAAGATGTTTCCTTGTAG
10BSEQ-F    GCTAAGGAGACCCTCTATCG
10BSEQ-R    AATAGCCACTACATCGCCAT
10PCR-R     GTCTATAGTCAATAGCCACTACATC
11PCR-F     TTAAATCAAATGACAGGCTTCACAA
11ASEQ-F    TGACAGGCTTCACAAAGCCA
11ASEQ-R    TGCCTACAACTTCGGTAGTT
11BSEQ-F    TGTGAAAGTCAACAACCCAC
11BSEQ-R    AGGCATATAATTGTTAAACA
11PCR-R     TAAACACATAAGGCATATAATTGTT
12PCR-F     TATGTCAAACCATTCTTAGGACAAG
12ASEQ-F    CATTCTTAGGACAAGCAGCA
12ASEQ-R    ACGAATTAAGATACAATTCT
12BSEQ-F    GGTTCTCTAATCTGTGTAAC
12BSEQ-R    ACTAATGATAAACCACATGA
12PCR-R     TTTGTACAATACTAATGATAAACCA
13PCR-F     TTAGGTCTTTCAGCTATAATGCAGG
13ASEQ-F    CAGCTATAATGCAGGTGTTC
13ASEQ-R    ACAAATCACGAGCAACTTCA
13BSEQ-F    GGCCGTGGCTTCTGCAAGAC
13BSEQ-R    CAGAATAGGTTGGCACATCA
13PCR-R     GGTCAAGCAACAGAATAGGTTGGCA
14PCR-F     ATAGTTTTTGATGGCAAGTCCAAAT
14ASEQ-F    ATGGCAAGTCCAAATGCGAC


14BSEQ-F    GTTGTTGATACCGATGTTGA
14BSEQ-R    AAAACAAGTACTAACAATCT
14ASEQ-R    TGACAGTTGTAACAATTTCA
14PCR-R     GCATAAGTTTAAAACAAGTACTAAC
15PCR-F     AGACTAACTTGTGCTACAACTAGAC
15ASEQ-F    GTGCTACAACTAGACAGGTT
15BSEQ-F    GCGTGGTGGTTCATACAAAAA
15BSEQ-F    CGTGGTGGTTCATACAAAAA
15BSEQ-R    CTCCAGGTAAGTGTTAGGAA
15ASEQ-R    GCACAGTACCCGGTAAGCCA
15PCR-R     TAACAGAACCCTCCAGGTAAGTGTT
16PCR-F     GGTTCTATTTCTTATGGTGAGCTTC
16ASEQ-F    CTTATAGTGAGCTTCGTCCA
16ASEQ-R    ACTCACCAAAAACACGTCTG
16BSEQ-F    TTAGATGTGTCTGCTTCAGT
16BSEQ-R    AACTCCATTAAACATGACTC
16PCR-R     TACTAAATGTAACTCCATTAAACAT
17PCR-F     ACAGCAATCTATGTATTCTGTATTT
17ASEQ-F    ATGTATTCTGTATTTCTCTG
17ASEQ-R    CTGACGGGAATGCCATTTTC
17BSEQ-F    TTTAGCAACTCAGGTGCTGA
17BSEQ-R    TTGGATACGGACAAATTTAT
17PCR-R     TTTGACCAGGTTGGATACGGACAAA
18PCR-F     ATTGGCCATTCTATGCAAAATTGTC
18ASEQ-F    CTATGCAAAATTGTCTGCTT
18ASEQ-R    CAGCATACAGCCATGCCAAA
18BSEQ-F    GAAGGTAAATTCTATGGTCC
18BSEQ-R    ACCTTGGAAGGTAACACCAG
18PCR-R     TCTTGAACTTACCTTGGAAGGTAAC
19PCR-F     GGTCGTACTATCCTTGGTAGCACTA
19ASEQ-F    TCCTTGGTAGCACTATTTTA
19ASEQ-R    AGCTAGTGTCAGCCAATTCA
19BSEQ-F    TTACCTTCTCTTGCAACAGT
19BSEQ-R    AAATAACAATGGGTAATACT
19PCR-R     TGCCAGTAATAAATAACAATGGGTA
20PCR-F     ACCTCTAACTATTCTGGTGTCGTTA
20ASEQ-F    ATTCTGGTGTCGTTACGACT
20ASEQ-R    AGAGCAGTACCACAGATGTG
20BSEQ-F    AACATTAAGTTGTTGGGTAT
20BSEQ-R    ATTAGCTACAGCCTGCTCAT
20PCR-R     CAGAATCACCATTAGCTACAGCCTG
21PCR-F     CTTCAGGCTATTGCTTCAGAATTTA
21ASEQ-F    TTGCTTCAGAATTTAGTTCT
21ASEQ-R    CAACAACCATGAGTTTGGCT
21BSEQ-F    GATAATGATGCACTTAACAA
21BSEQ-R    GGCAAGTGCATTGTCATCAG
21PCR-R     TGTTATAGTAGGCAAGTGCATTGTC
22PCR-F     AATAATGAACTGAGTCCAGTAGCAC
22ASEQ-F    TGAGTCCAGTAGCACTACGA
22ASEQ-R    CTGCAAAAGCACAGAAGGAA
22BSEQ-F    ATGGTGCTGGGCAGTTTAGC
22BSEQ-R    ACAGACTGTGTTTCTAAGTG
22PCR-R     CGCAGACGGTACAGACTGTGTTTCT
23PCR-F     GGATTCTGTGACTTGAAAGGTAAGT
23ASEQ-F    GACTTGAAAGGTAAGTACGTC
23ASEQ-R    TGTTGGTAGTTAGACATAGT
23BSEQ-F    TAAAAACTAATTGCTGTCGC
23BSEQ-R    CTAAGTTAGCATATACGCGT
23PCR-R     ACACGCTCACCTAAGTTAGCATATA
24PCR-F     ACAATTGCTGTGATGATGATTATTT
24ASEQ-F    TGATGATGATTATTTCAATA
24ASEQ-R    AGACAAAGTCTCTCTTCCGT
24BSEQ-F    GGGCATTGGCTGCTGAGTCC
24BSEQ-R    CTGGATCAGCAGCATACACT
24PCR-R     GCATGCATAGCTGGATCAGCAGCAT
25PCR-F     GTGAGTTAGGAGTCGTACATAATCA
25ASEQ-F    AGTCGTACATAATCAGGATG
25ASEQ-R    CAATCAAAGTATTTATCAAC
25BSEQ-F    CTATCAGTGATTATGACTAT
25BSEQ-R    TTGACTTCAATAATTTCTGA
25PCR-R     GTGGCGGCTATTGACTTCAATAATT
26PCR-F     GTGCAAAGAATAGAGCTCGCACCGT
26ASEQ-F    TAGAGCTCGCACCGTAGCTG
26ASEQ-R    GATGATGTTCCACCTGGTTT
26BSEQ-F    CACACCGTTTCTACAGGTTA
26BSEQ-R    TGCTAGCTACTAAACCTTGA
26PCR-R     AAGTTCTTAATGCTAGCTACTAAAC
27PCR-F     TGCGTAAACATTTCTCCATGATGAT
27ASEQ-F    TTTCTCCATGATGATTCTTT
27ASEQ-R    TGAAAGACATCAGCATACTC
27BSEQ-F    AAACAGATGGTACACTTATG
27BSEQ-R    GATTAACAGACAACACTAAT
27PCR-R     CAAACATAGGGATTAACAGACAACA
28PCR-F     GTGCCTGTATTAGGAGACCATTCCT
28ASEQ-F    TAGGAGACCATTCCTATCTT
28ASEQ-R    CCATATGACAGCTTAAATGT
28BSEQ-F    CTGGCGATTACATACTTGCC
28BSEQ-R    CATAGTGCTCTTGTGGCACT
28PCR-R     GTAATTCTCACATAGTGCTCTTGTG
29PCR-F     ACAAGTTGAATGTTGGTGATTACTT
29ASEQ-F    TGTTGGTGATTACTTTGTGT
29ASEQ-R    GAATTCACTTTGAATTTATC
29BSEQ-F    TATGTGAAAAGGCATTAAAA
29BSEQ-R    CAGCAGGACAACGGCGACAA
29PCR-R     TCAACAATTTCAGCAGGACAACGGC
30PCR-F     TAGAACCAGAATATTTTAATTCAGT
30ASEQ-F    ATATTTTAATTCAGTGTGCA
30ASEQ-R    GTAGTTTGTGTGAATATGAC
30BSEQ-F    CACAGAACGCTGTAGCTTCA
30BSEQ-R    GTATGCCTGGTATGTCAACA
30PCR-R     ATGTCCTTTGGTATGCCTGGTATGT
31PCR-F     CTGGTCTTCATCCTACACAGGCACC
31ASEQ-F    TCCTACACAGGCACCTACAC
31ASEQ-R    AGATGTTTAAACTGGTCACC
31BSEQ-F    ACTTAGTAGCTGTACCGACT
31BSEQ-R    GCTGAACATCAATCATAAAT
31PCR-R     AAGCCCCACTGCTGAACATCAATCA
32PCR-F     GCTTTTCTACTTCATCAGATACTTA
32ASEQ-F    AAGTAGTCTATGAATACGGA
32ASEQ-R    TTCCATTCTACTTCAGCCTG
32BSEQ-F    TTGTGAAGTCTGCATTGCTT
32BSEQ-R    AAAAGAAAGGCAATTGCTTT
32PCR-R     TCAGAATAGTAAAAGAAAGGCAATT
33PCR-F     GTGGTAGTTTGTATGTGAATGGGCA
33ASEQ-F    GTATGTGAATAAGCATGCAT


Chim, Chiu, and Lo 


33ASEQ-R    GCTTCGCCGGCGTGTCCATC
33BSEQ-F    CTTATAACCTGTGGAATACA
33BSEQ-R    GTGAAGAACAAGCACTCTCA
33PCR-R     AAGACAGTAAGTGAAGAACAAGCAC
34PCR-F     AAAGAGAAGCCCCAGCACATGTATC
34ASEQ-F    CCCAGCACATGTATCTACAA
34ASEQ-R    ATGAATTCATCCATAGCGAG
34BSEQ-F    TGCCTGAAACCTACTTTACT
34BSEQ-R    TGACCACTTTTGAAATCACT
34PCR-R     ATTGTAACCTTGACCACTTTTGAAA
35PCR-F     AATGTGTGTGTTCTGTGATTGATCT
35ASEQ-F    TTCTGTGATTGATCTTTTAC
35ASEQ-R    TTATCAGAGCCAGCACCAAA
35BSEQ-F    ATGTCGCAAAGTATACTCAA
35BSEQ-R    CTGTTATCTTTACAGCTATA
35PCR-R     CAAGAATGCTCTGTTATCTTTACAG
36PCR-F     ATGACTCTAAAGAAGGGTTTTTCAC
36ASEQ-F    AGAAGGGTTTTTCACTTATC
36ASEQ-R    TCCAGAAGAGAATAAATCAT
36BSEQ-F    CACTCTTTGACATGAGCAAA
36BSEQ-R    TAGTATGAAACCCTGTAACA
36PCR-R     GTATGATTAATAGTATGAAACCCTG
37PCR-F     ATCCTGATGAAATTTTTAGATCAGA
37ASEQ-F    AATTTTTAGATCAGACACTC
37ASEQ-R    TTTTCTGAAACATCAAGCGA
37BSEQ-F    AACCCATGGGTACACAGACA
37BSEQ-R    TTGTACCATTTTCATCATAC
37PCR-R     GCATCTGTGATTGTACCATTTTCAT
38PCR-F     AAGACATTTGGGGCACGTCAGCTGC
38ASEQ-F    GGGCACGTCAGCTGCAGCCT
38ASEQ-R    TTCAACTTAGTGGCAGAAAC
38BSEQ-F    GAAAAAAAATTTCTAATTGT
38BSEQ-R    GAGCAGGTGGGGTGCAAGGT
38PCR-R     TAACAATTAAGAGCAGGTGGGGTGC
39PCR-F     ATCTTAGACATGGCAAGCTTAGGCC
39ASEQ-F    TGGCAAGCTTAGGCCCTTTG
39ASEQ-R    GGTGAAATGTCTAATATTTC
39BSEQ-F    CAAAGAGATTTCAACCATTT
39BSEQ-R    TTTGGCTAGTACTACGTAAT
39PCR-R     ACAATAGATTTTTGGCTAGTACTAC
40PCR-F     TCGACACTTCTTATGAGTGCGACAT
40ASEQ-F    TTATGAGTGCGACATTCCTA
40ASEQ-R    GTTGGGGTTTTGTACATTTG
40BSEQ-F    GCACACAACTAAATCGTGCA
40BSEQ-R    CACCAAATGTCCATCCAGCA
40PCR-R     GCAGGCCAGCACCAAATGTCCATC
41PCR-F     TGTTGCCACCTCTGCTCACTGATGA
41ASEQ-F    TCTGCTCACTGATGATATGA
41ASEQ-R    ATTTGTACCTCCGCCTCGAC
41BSEQ-F    ACACACTTGTTAAACAACTT
41BSEQ-R    GGAAGTATGCTTTGCCTTCA
41PCR-R     CCTTCACGAGGGAAGTATGCTTTGC
42PCR-F     TTGTCTTCCTACATGTCACGTATGT
42ASEQ-F    ACATGTCACGTATGTGCCAT
42ASEQ-R    AAATTTTTAGCGACCTCATT
42BSEQ-F    CATCACCAGATGTTGATCTT
42BSEQ-R    ATTGATCCAAGAGTAAAAAA
42PCR-R     CTGTGCAGTAATTGATCCAAGAGTA
43PCR-F     ACTCTGAGCCAGTTCTCAAGGGTGT
43ASEQ-F    AGTTCTCAAGGGTGTCAAAT
43ASEQ-R    TTGTAGAAAATATATCAAGG
43BSEQ-F    ACTGCTGCTATTTGTTACCA
43BSEQ-R    TGGTAGTAAACTTCGGTGAA
43PCR-R     AGACTCAAGCTGGTAGTAAACTTCG
44PCR-F     CTACCAAATTGGTGGTTATTCTGAG
44ASEQ-F    GGTGGTTATTCTGAGGATAG
44ASEQ-R    GATGGCTAGTGTGACTAGCA
44BSEQ-F    TATGTACTCATTCGTTTCGG
44BSEQ-R    AAATTGTAGTAACATAATCC
44PCR-R     TAGAATAGGCAAATTGTAGTAACAT
45PCR-F     ATTACCGTTGAGGAGCTTAAACAAC
45ASEQ-F    AGGAGCTTAAACAACTCCTG
45ASEQ-R    CAATGACAAGTTCACTTTCC
45BSEQ-F    TCAATGTGGTCATTCAACCC
45BSEQ-R    ATTGTAACCTGGAAGTCAAC
45PCR-R     TATCTCTGCTATTGTAACCTGGAAG
46PCR-F     ACAGACCACGCCGGTAGCAACGACA
46ASEQ-F    CCGGTAGCAACGACAATATT
46ASEQ-R    GGGTGAAATGGTGAATTGCC
46BSEQ-F    GCGAGCTATATCACTATCAG
46BSEQ-R    ATTAAAACAAGGAATAGCAG
46PCR-R     AATAAGCATTATTAAAACAAGGAAT
47PCR-F     TCACCATTAAGAGAAAGACAGAATG
47ASEQ-F    GAGAAAGACAGAATGAATGA
47ASEQ-R    GGATCTTGACAGTTGATAGT
47BSEQ-F    CTGCTTGGCTTTGTGCTCTA
47BSEQ-R    CTTGCCATGCTGAGTGAGAG
47PCR-R     TAAGTTCCTCCTTGCCATGCTGAGT
48PCR-F     CGCAATGGGGCAAGGCCAAAACAGC
48ASEQ-F    CAAGGCCAAAACAGCGCCGA
48ASEQ-R    TTCCTTGAGGAAGTTGTAGC
48BSEQ-F    TGGGTTGCAACTGAGGGAGC
48BSEQ-R    TTTTTGGCGAGGCTTTTTAG
48PCR-R     TGGCAGTACGTTTTTGGCGAGGCTT
49PCR-F     AGCAAAGTTTCTGGTAAAGGCCAAC
49ASEQ-F    CTGGTAAAGGCCAACAACAA
49ASEQ-R    GTGGGAATGTTTTGTATGCG
49BSEQ-F    ACTTATCATGGAGCCATTAA
49BSEQ-R    CTACTTGTGCTGTTTAGTTA
49PCR-R     TTAACTAAACCTACTTGTGCTGTTT
50PCR-F     TGGGCTATGTAAACGTTTTCGCAAT
50ASEQ-F    AAACGTTTTCGCAATTCCGT
50PCR-R     TTACACATTAGGGCTCTTCCATATAGG

Sequences for primary (PCR-F and PCR-R) and secondary (PCR-F and BSEQ-R) PCR
primers. ASEQ-F, BSEQ-F, ASEQ-R, and BSEQ-R are the sequencing primers. The
sequences are from 5' to 3'.
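
The nested layout of the primers can be checked directly from the sequences. As an illustration, the Python sketch below measures how many bases at the 3' end of an outer PCR primer are repeated at the 5' end of the neighboring sequencing primer, using amplicon 01 from Table 4. The helper `overlap_length` is a hypothetical name introduced here, not part of any published tool.

```python
def overlap_length(outer, inner):
    """Length of the longest suffix of `outer` that is also a prefix of `inner`."""
    for k in range(min(len(outer), len(inner)), 0, -1):
        if outer.endswith(inner[:k]):
            return k
    return 0


# Primer sequences copied from Table 4 (written 5' to 3').
primers = {
    "01PCR-F": "CTACCCAGGAAAAGCCAACCAACCT",
    "01ASEQ-F": "AAAGCCAACCAACCTCGATC",
    "01PCR-R": "TGCCAAGCTCGTCACCTAAGTCATA",
    "01BSEQ-R": "GTCACCTAAGTCATAAGACT",
}

for outer, inner in (("01PCR-F", "01ASEQ-F"), ("01PCR-R", "01BSEQ-R")):
    shared = overlap_length(primers[outer], primers[inner])
    shift = len(primers[outer]) - shared  # how far inward the inner primer starts
    print(f"{outer} / {inner}: {shared} shared bases, inner primer starts {shift} bases inward")
```

A check of this kind is a quick way to catch transcription errors when primer lists are retyped for ordering.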


3.7. Sequence Comparison 


1. Edit, align, and compare sequences using the Tor2 strain (GenBank accession 
number AY274119) as a reference with the software designed for this purpose, 
for example SeqScape (see Note 9). 

2. Re-sequence regions that reveal nucleotide substitutions using a combination of 
different primer sets to ensure the quality of the sequencing data (see Note 10). 
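
The comparison in step 1 amounts to listing the positions at which an aligned isolate sequence differs from the reference. The Python sketch below shows the idea on two short made-up fragments. The sequences, the start coordinate, and the function name are all hypothetical, and dedicated software such as SeqScape additionally handles base-call quality, which is ignored here.

```python
def find_substitutions(reference, isolate, start_position=1):
    """List substitutions between two aligned, equal-length sequences.

    Each substitution is written as <reference base><position><isolate base>,
    the notation used in this chapter (e.g. T3852C). Gaps ('-') and ambiguous
    calls ('N') are skipped rather than reported.
    """
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    for offset, (ref_base, iso_base) in enumerate(zip(reference, isolate)):
        if ref_base != iso_base and ref_base not in "-N" and iso_base not in "-N":
            substitutions.append(f"{ref_base}{start_position + offset}{iso_base}")
    return substitutions


# Made-up fragments for illustration only; position 100 is an arbitrary start.
reference_fragment = "ACGTACGT"
isolate_fragment = "ACGCACGA"
print(find_substitutions(reference_fragment, isolate_fragment, start_position=100))
```

Positions flagged in this way are the candidates for re-sequencing in step 2.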


4. Notes 


1. For RNA extraction, carrier poly(A) RNA is added to the lysis buffer to increase 
the yield. Because the PCR primers are specific to the SARS-CoV genome, the 
subsequent amplification would not be affected. However, if one wants to per- 
form 3' rapid amplification of cDNA ends (3' RACE) or similar cloning opera- 
tion that depends on oligo(dT) priming of poly(A) tail of the viral RNA, then the 
carrier poly(A) RNA should be avoided. 

2. Our previous studies have shown that, with two rounds of PCR, even direct
clinical samples can be sequenced. This obviates the need for viral culture,
which may pose a health hazard if not handled properly. It also minimizes the
possible generation of viral mutants through culturing. However, direct clinical
samples of high viral titer and a sensitive PCR amplification are required.

3. Yields of viral RNA should be determined by quantitative RT-PCR, because spec- 
trophotometric determination is prone to error as a result of low RNA quantity 
and interference by the carrier poly(A) RNA, which contributes to most of the 
RNA present. 

4. A prolonged denaturation step is used to remove secondary RNA structures in 
the SARS-CoV genome that impede reverse-transcription. The use of random 
hexamer ensures an even representation of the whole RNA genome and allows 
more sequence information to be obtained from a limited amount of viral RNA. 

5. We recommend the use of a reverse transcriptase with increased thermal stabil- 
ity, which facilitates reverse-transcription at a higher temperature (55°C) than 
normal (42°C). This unfolds some of the secondary RNA structures, and thus 
produces longer cDNA at higher yields. 

6. We recommend the simultaneous use of two different DNA polymerases in the
PCR amplification. For example, the cDNA polymerase mix that we use contains
KlenTaq-1 DNA polymerase and a second DNA polymerase with 3' to 5' proofreading
activity. The inclusion of a minor amount of a proofreading polymerase
results in an error rate that is significantly lower than that for Taq alone (18). This
advantage is obvious when one is concerned about genomic sequence variations
between different viral strains. The use of a two-polymerase system also increases
the efficiency and yield, and hence the sensitivity, which is important when the
viral titer is suboptimal.

7. The carryover of unused PCR primers into the sequencing reaction would lead to 
poor sequencing results. Like the sequencing primers, these unused PCR primers 
would also bind nonspecifically to the sequencing template in the cycle sequenc- 
ing reaction, and, hence, generate noisy sequencing traces overshadowing the 
intended traces. Purification of the PCR products is, thus, usually recommended 
prior to their use as sequencing templates. However, these methods are labor- 
intensive and pose extra contamination risk, as they involve additional steps of 
opening and handling PCR products. Notably, we have suggested an optimized 
PCR protocol for direct sequencing of PCR products without PCR product purifi- 
cation. With the low PCR primer concentrations and the optimal number of cycles, 
most of the PCR primers are consumed at the end of the PCR. Furthermore, a 
nested sequencing primer selectively extends the specific PCR product in the 
cycle sequencing reaction. This would suppress any nonspecific PCR product 
from extension. The combined effect is a neat sequencing trace. 

8. The amount of PCR product used for the sequencing reaction must be optimized
carefully with different sequencing systems. Although more PCR product input
usually gives higher signal intensities, it may also give shorter read lengths and
oversaturated signals.

9. The PCR primers target 50 amplicons of 700 bp each that overlap with each other
along the SARS-CoV genome. The sequencing primers are designed in such a way that
any sequence masked by the PCR primer binding sites and the sequencing
primer peak on one amplicon is reliably backed up by the homologous sequence
in the overlapping amplicon.

10. We advocate rigorous validation of any genomic sequence variation
by resequencing regions with different combinations of primers and sequencing
chemistries. Because variation seen in a single viral isolate could potentially be a
result of sequencing artifacts, we consider only the genomic sequence variations
that are shared by at least two SARS-CoV isolates.
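
The filtering rule in Note 10 can be expressed as a short Python sketch: keep only the variations that occur in at least two isolates. The two Amoy Gardens mutations and the isolate names come from the Introduction, whereas the singleton `A100G` is a hypothetical variant added purely to show a call being filtered out. The function name is likewise illustrative.

```python
from collections import Counter


def shared_variants(variants_by_isolate, min_isolates=2):
    """Return the variations seen in at least `min_isolates` isolates, sorted by name."""
    counts = Counter(
        variant
        for variants in variants_by_isolate.values()
        for variant in set(variants)  # count each variant once per isolate
    )
    return sorted(v for v, n in counts.items() if n >= min_isolates)


variants_by_isolate = {
    "CUHK-AG01": {"T3852C", "C11493T", "A100G"},  # A100G is hypothetical
    "CUHK-AG02": {"T3852C", "C11493T"},
    "CUHK-AG03": {"T3852C", "C11493T"},
}

print(shared_variants(variants_by_isolate))
```

A variation that fails this filter is not necessarily an artifact, but under the rule above it would need confirmation in a second isolate before being reported.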